Decomposability of Translation Metrics for Improved Evaluation and Efficient Algorithms
Abstract
BLEU is the de facto standard for evaluation and development of statistical machine translation systems. We describe three real-world situations involving comparisons between different versions of the same systems where one can obtain improvements in BLEU scores that are questionable or even absurd. These situations arise because BLEU lacks the property of decomposability, a property which is also computationally convenient for various applications. We propose a very conservative modification to BLEU and a cross between BLEU and word error rate that address these issues while improving correlation with human judgments.
Publication date: 2008